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In this work we consider the problem of preparation of the stationary distribution of irreducible, 
time-reversible Markov chains, which is a fundamental task in algorithmic Markov chain theory. For 
the classical setting, this task has a complexity lower bound of 17(1/5), where 5 is the spectral gap 
of the Markov chain, and other dependencies contribute only logarithmically. In the quantum case, 
the conjectured complexity is 0(V <5 _1 ) (with other dependencies contributing only logarithmically). 
However, this bound has only been achieved for a few special classes of Markov chains. In this work, 
we provide a method for the sequential preparation of stationary distributions for sequences of 
general time-reversible N— state Markov chains, akin to the setting of simulated annealing meth¬ 
ods. The complexity of preparation we achieve is 0(V5~ 1 N 1 ' 4 ), neglecting logarithmic factors. 
While this result falls short of the conjectured optimal time, it still provides at least a quadratic 
improvement over other straightforward approaches for quantum mixing, applied in this setting. 


I. INTRODUCTION 

Quantum random walks exhibit features that can be significantly different to their classical counterparts. As 
a famous example, the hitting time, which is a fundamental quantity in the theory of random walks, can be 
exponentially reduced if so-called coined quantum walks are employed [1]. However, such strong results are only 
known to hold for a few special classes of undirected random walks. Alternative approaches to quantization of 
random walks over more general graphs, in which case we talk about Markov chains (MCs), most often aim at 
more modest polynomial improvements. Using the so-called Szegedy-type quantum walks, a generic quadratic 
improvement in hitting times [2] was shown for all time-reversible MCs 1 . The generality of the setting, while 
preventing superpolynomial speedups, compensates with its greater applicability. Early on, related approaches 
have e.g. provided a basis for a quadratic improvement of algorithms for element distinctness [3], element 
detection [4] and the triangle problem [5]. Setting aside hitting times, quantum walks have been investigated 
for their capacity to speed up mixing processes, that is, the task of preparing stationary distributions of a 
given MC. This task constitutes another fundamental problem of Markov chain theory. Efficient mixing is, for 
instance, important in the context of Markov Chain Monte Carlo (MCMC) algorithms. MCMC methods are, 
for instance, central to many algorithmic solutions to hard combinatorial problems and problems stemming 
from statistical physics [6]. Quantum improvements in this context have already been reported [7-12]. Beyond 
MCMC-related applications, efficient mixing also extends the applicability of the aforementioned quantum 
hitting time speedups, as the preparation of the relevant stationary distributions is sometimes assumed to be 
an affordable primitive [2, 13]. However, despite the considerable interest, the quantum speedup of mixing 
processes has only been shown for certain classes of MCs [7, 14-17], and it is an open conjecture that a generic 
quadratic speedup for mixing can be obtained for all time-reversible MCs [7]. For a recent review on quantum 
walks see e.g. [18]. 

In this work we consider the problem of sequentially generating stationary distributions of sequences of slowly 
evolving Markov chains, illustrated in Fig lb. This setting is similar to the scenario of simulated annealing, in 
which case quantum improvements have already been achieved [11, 12, 19]. There is, however, a key distinction 
between the annealing settings and ours: in annealing settings, the target is to produce a sample from the 
stationary distribution of the final chain only, whereas the intermediary chains have only an accessory role. In 
contrast, in our case, we must produce samples sequentially, for each chain in the sequence (and, indeed, the 
sequence can in principle be infinite). The motivation for this problem stems from recent work in artificial 
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1 The guaranteed quadratic improvement is shown, provided only one target element exists. 



Tbnit 


a) 


MCi 


7Ti 


mc 5 


vr 2 


7T t -l 


MC/ 


•TTi 


time-steps 


IN 


Tinit-*■ 

MCi 

. * 

mc 2 

. ■> . » 

MC t 

. > 

MQ+i 

b) 








II II 

OUTS 7Tl 7r 2 7Ti 7T t+ i 


FIG. 1. Standard simulated annealing is presented in part a) of the figure: at each time-step k, we produce a sample from 
the distribution nk which need not be exactly the stationary distribution of the Markov chain MC*,. This is used as the 
initial distribution for the next Markov chain. However, the last sample is distributed (approximately) according to 7r t 
which is the stationary distribution of MCt and the target distribution. Part b) of the figure represents our setting: the 
sequential sampling from a slowly evolving sequence of Markov chains. At each time-step k, we are required to produce 
an element sampled from 7 r*,, which is a good approximation of the stationary distribution of the Markov chain MC*,. 
The sequence need not terminate, or it may be arbitrarily long. 


intelligence (AI) [20], by the authors and other collaborators, but may have broader applicability. We will 
comment on this further later. 

For our problem, we first identify two classes of Markov chains, characterized by the distance of their sta¬ 
tionary distribution from the uniform distribution. These two classes cover all discrete time-reversible Markov 
chains, and for both classes mixing can be achieved in time 0(V <5 —1 AT 1 / 4 ), neglecting logarithmic terms. The 
methods used for mixing differ for the two classes, and the second technique (utilized when the target distri¬ 
bution is, in a sense we specify later, far from the uniform distribution) requires additional information about 
the underlying Markov chain. In particular, it requires a small number of samples from the very underlying 
stationary distribution we seek to construct. While this additional information cannot be straightforwardly 
recovered given just one MC, we show that in the context of slowly evolving Markov chains, it can. 

The structure of this paper is as follows. In Section II we present related work and clarify the distinction 
between our and previously studied settings. Following this, in Section III we cover the preliminaries and 
introduce all the (sub)-protocols required for our main result. Finally, in Section IV we give our main result, 
and finish off with a brief discussion in Section V. 


II. RELATED WORK 

The setting of slowly evolving MCs is especially relevant in the pervasive simulated annealing methods. In 
MCMC methods in general, the task is to produce a sample from the stationary distribution of some target MC 
P T . For concreteness, this can be the Gibbs distribution a T of a physical system at a target (low) temperature 
T. Markov chains which have <tt as the stationary distribution are easy to construct, but, in general, the 
mixing time required to achieve stationarity is prohibitive. Better results are often achieved by using simulated 
annealing methods, in which one constructs a sequence of MCs Pi ,..., P t = Pt , which, for instance, encode the 
Gibbs distributions at gradually decreasing temperatures. The choice of the temperature-dependent sequence is 
often referred to as the annealing schedule. The fact that the temperatures decrease gradually ensures that the 
stationary distributions of neighboring chains are close, so the sequence is slowly evolving. As the temperature 
corresponding to Pi is high, the stationary distribution of Pi is essentially uniform, and Pi mixes rapidly 
(effectively in one step). Simulated annealing is then realized by sequentially applying the chains Pi to P t 
to the initial distribution. In this process, no individual chain fully mixes, but nonetheless, often the reached 
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distribution approximates the target distribution well, even when the number of steps t is substantially smaller 
than the mixing time of Pt itself. 

Quantum variants (and generalizations) of the classical annealing approach have been previously addressed 
in, for instance, [11, 12, 19]. There, the so-called Szegedy walk operators are employed instead of the classical 
transition matrices P t . The approaches differ, with one commonality: at each time-step, the quantum state 
obtained from the previous step is used in the subsequent step, and thus quantum coherence is maintained 
throughout the steps of the protocols. 

Our setting is inspired by a recent result by the authors and other collaborators where Szegedy-type quantum 
walks are used in problems of AI [20]. In the so-called reflective Projective Simulation (rPS) model of artificial 
intelligence, at each time-step i, the target action of an rPS agent is encoded in the stationary distribution of 
a MC Pt which is gradually modified as the agent learns through the interaction with the environment. The 
agent’s action, which is chosen by sampling from this distribution, has to be output at each time-step. For more 
details on the Projective Simulation model for AI, we refer the reader to [20-22]. Viewed abstractly, in this 
setting we have an, in principle, infinite sequence (a stream) of MCs P±, P 2 ,. ■ ., Pt, ■ ■ ■ which is slowly evolving. 
At each time-step t, we are required to produce an element sampled according to the stationary distribution 
of P t 2 . In contrast, in simulated annealing, the sequence is finite, and we are only required to produce a 
sampled element distributed according to the stationary distribution of the last MC. The quantum approaches 
to simulated annealing cannot be straightforwardly applied to our setting, as this would require measuring the 
quantum state at each step. This would prevent all the quantum speedup. Alternatively, the sequence would 
have to be re-run from the beginning at each time-step, which is not acceptable as the sequence can be of 
arbitrary length. The differences between the two settings are illustrated in Figs, la and lb. It is worthwhile 
noting that even the classical simulated annealing methods do not immediately help with our task. In classical 
simulated annealing, at each time step t we are dealing with a classical sample (corresponding to step t) which 
can be copied. However, one cannot output the classical sample at time - step t. and use it as a seed for the 
next time-step: this would induce correlations between the samples at different time-steps whereas we require 
independent samples [23] 3 . 


III. PRELIMINARIES 

In this section, we will set up the notation and define the basic tools we will employ throughout this paper. 
Part of the presentation is inspired, and closely follows, the approach given in [13]. 

The basic building block we will use in this work is the so-called Szegedy walk operator W(P), defined for any 
ergodic, aperiodic and time-reversible Markov chain P. First, we will briefly recap a few basic notions regarding 
Markov chains for the convenience of the reader, and refer to [24] for further details. Throughout this paper, 
with P we will denote a left-stochastic matrix (a matrix with non-negative, real entries which add up to one 
in every column). As P, along with an initial distribution, specifies a Markov chain, we will refer to P as the 
transition matrix and the Markov chain, interchangeably. If P is irreducible and aperiodic, then there exists 
a unique stationary distribution 7 r, such that Pn = n 4 . Here, n denotes a distribution over the state space, 
represented as a non-negative column vector 7r = ( 7 r*)|^ 1 , 7 r, £ Kg', such that J2i n i = 1- If 7r is a distribution, 
then we will refer to the element which occurs with a largest probability i m ax = argmax j; 7r, as a mode of the 
distribution 7r, and the corresponding largest probability 7r max = max,;7r,; as the probability of a mode. Note 
that while the mode need not be unique, the probability of the/a mode is. 

The final property we will require is that the Markov chain P is time-reversible, that is, that it satisfies 
detailed balance: an ergodic Markov chain P with stationary distribution 7 r is time-reversible if the following 
holds: 

TtiPij — TTjP/'ijV 7,J. (1) 

More generally, for an ergodic Markov chain P, over the state space of size N with stationary distribution 
7 r, we define the time-reversed Markov chain P* with P* = D(n)P T P(7r) _1 , where D is the diagonal matrix 
D = diag(iTi ,..., 7 r/v )- 5 Then, P is time-reversible if P = P*. 


2 For completeness, in the rPS model the agent actually needs to produce a sample from a renormalized tail of the stationary 
distribution, which can have very low cumulative weight, making the process very slow. To resolve this problem, we have 
employed a quantum approach in [20]. 

3 This problem can be circumvented by letting each MC from the sequence fully mix. However, in this case we lose any advantage 
of simulated annealing, and just perform brute-force mixing at each time-step. 

4 In this work we will adhere to the convention in which the transition matrices are left-stochastic, and act on column-vectors from 
the left. 

5 The inverse of D always exists, as stationary distributions of irreducible aperiodic Markov chains have non-zero support over the 
entire state space. 
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Next, we review the basics of so-called Szegedy-type quantum walks, to an extent inspired by the presentation 
given in [13] 


A. The Szegedy walk operator 

While the Szegedy walk operator W (P) can be defined directly, it will be useful for us to construct it from a 
more basic building block, the diffusion operator Up. The diffusion operator Up acts on two quantum registers 
of N states, (partially) defined as follows: 


N 

Up |i)j |0) II = |z)j \/Pji li)n ■ (2) 

f=i 

The operator Up is a natural quantum analog of the operator P in the sense that a classical random walk can 
be recovered by applying Up, measuring the second register, re-setting the first register to 10), and swapping 
the registers. While Up is not uniquely defined, any operator satisfying Eq. (2) will do the job. 

The operator Up, and its adjoint are then used to construct the following operator: 

ref{A) = U P (t l ®Z ll )U ] P , (3) 

where Z = 2 |0) (0| — 1, reflects over the state |0). The operator ref(A) is itself a reflector, reflecting over the 
subspace A = span({[/p |i) |0)}*). The Szegedy quantum walk is often explained as a bi-partite walk between 
two copies of the original graph, and ref(A) corresponds to one direction. The other direction is established 
by defining the diffusion operator in the opposite direction: Vp = SWAP\j\ Up SWAPiji, and proceeding 
analogously as in the case for the set A, to generate the ref(B) operator, reflecting over B = span({Vp |0) |j)}y)- 
The Szegedy walk operator is then defined as W(P) = ref(B)ref(A). In [4, 13] it was shown that the operator 
W(P) and P are closely related, in particular in the case when P is time-reversible. 

Often we will be referring to the coherent encoding of a distribution 7r, denoted |7r). The state 17r) is a pure 
state of an N —level system given by |7r) = V ni I*) • ^ is clear that a computational basis measurement 

(so a projective measurement w.r.t. the basis {|z)}i) of the state |7r) outputs an element distributed according 

tO 7T. 

In the context of Szegedy-type quantum walks, it is convenient to define another type of a coherent encoding, 
relative to a Markov chain P, which we temporarily denote |7r'). This encoding is defined by |7r') = Up |7r)j(g) |0) n , 
where Up is the Szegedy diffusion operator. It is easy to see that |7r) and |7r') are trivially related via the diffusion 
map (more precisely, the isometry 17r) —> Up |tt) (g) 10)) and moreover that the computational measurement of 
the first register of |7r') also recovers the distribution 7r. Due to this, by abuse of notation, we shall refer to both 
encodings as the coherent encoding of the distribution 7r, and denote them both with |7r), where the particular 
encoding will be clear from the context. However, for the majority of the text, we will be using the latter 
meaning. 

With these definitions in place we can further clarify the relationship between the classical transition operator 
P and the Szegedy walk operator W(P). Let n be the stationary distribution of P, so Ptt = n. Then the coherent 
encoding of the stationary distribution n of P, given by |7r) = Up ]>T | i) |0), is also a +1 eigenstate of W ( P ), 

that is, W(P) |7r) = 17r). Moreover, in the subspace A+B, so-called busy subspace, it is the unique +1 eigenstate. 
On the orthogonal complement of the busy subspace, W (P) acts as the identity. 

Moreover, the spectrum of P and W (P) is intimately related, and in particular the spectral gap 

5 = 1— max |A|, (4) 

where A denote the eigenvalues of P and cr(P) denotes the spectrum of P, is essentially quadratically smaller 
than the phase gap 

A = min {2|0||e ie £ a (W (P)), 6 ± 0} , (5) 

where 9 denote the arguments of the eigenvalues, i.e. eigenphases, of W{P). This relationship is at the very 
basis of all speedup obtained from employing Szegedy-type quantum walks, which we shall elaborate further. 
In this paper we will not use other results than those we briefly exposed here, and we refer the interested reader 
to [2, 13] for further details. 
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B. 7r) projective measurement 

The first application of the walk operator W ( P ) allows us to approximate a projective measurement on the 
| 7 t) state, where n is the stationary distribution of P. 

This is achieved by using Kitaev’s phase detection algorithm 6 [25] on W(P) (with precision 0(l/y/8)), which, 
if followed by the measurement of the phase-containing register, approximates the projective measurement on 
the state |7r). To understand why this holds, recall that the W(P) operator has the state 17r) as the unique 
+1 eigenstate, in the busy subspace. Moreover the values of the phases of all other eigenstates (in the same 
subspace) are at least A. 

Thus, provided the state we perform the measurement on is in A + B, the residual state, conditioned on 
detecting zero phase, is a good approximation of |7r). The error can be further suppressed by iterating the 
procedure, as was suggested in [13], there for the purpose of approximate reflection, which we will elaborate on 
next. More precisely, the errors can be made exponentially small with linear overhead, yielding an overall cost 
0(l/y/S). Here, the O , the so-called soft-0 notation ignores the logarithmically contributing factors, in this 
case stemming from the quality of the approximation. This result can be seen as a consequence of Theorem 6 in 
[13]. This is a very useful tool for ’purifying’ an already good approximation of the target state |7r). However, 
this projective measurement behaves correctly only if we are guaranteed the state we have is in the space A + B. 
Fortunately, this is easy to achieve. In particular, testing whether a given state is in A (or B ) is straightforward: 
one simply applies Up (or Vp) and checks the contents of the second (or first) register. Provided we observe the 
state |0), we are guaranteed that we are in the correct subspace. Since the target state |7r) is in A, it will suffice 
to check whether the initial state is in A first and if it is perform the \t:) projective measurement. The sequence 
of these two measurement (A membership measurement, followed by the phase measurement) constitutes the 
17r) projective measurement. The success probability of this measurement, applied on the pure state | ip) is in 
0(F(\tp) , |7r))), that is on the order of the fidelity F(\i(>), |tt)) = | (ip\ 7r)| 2 between the input state and the |-7r) 
state. Note that if the measurement were perfect, the success probability would be exactly the fidelity. 


C. Approximate reflection over 17r) 

One of the central tools in the theory of Szegedy-type quantum walk is the so-called Approximate Reflection 
Operator ARO(P) ~ 2 |7r) (7 t| — 1, which approximately reflects over the state 17r) [13]. The basic idea for the 
construction of this operator is similar to the one we gave for the |7r) projective measurement. By applying 
Kitaev’s phase detection algorithm on W(P) (with precision 0(log(A))), applying a phase flip to all states 
with phase different from zero, and by undoing the phase detection algorithm, we obtain an arbitrarily good 
approximation of the reflection operator R(P ) = 2 17r) (7r| — 1, for any state within A + B. The errors of the 
approximation can be efficiently suppressed by iteration (by the same arguments as for the |7r) measurement) 
[13], so the cost of the approximate reflection operator is again in 0(1/A) = 0(1/VS). 

Thus, the second gadget in our toolbox is the operator ARO(P), which approximates a perfect reflection 
R(P) on A + B, while incurring a cost of 0(1/y/8) calls to the walk operator W(P). 

The operator ARO(P) is central to many of the results employing Szegedy-type walks [2, 13], in particular 
in tasks of element finding, as we shall clarify next. 


D. Element searching and unsearching 

The approximate reflection operator ARO(P), along with the capacity to flip the phase of a chosen subset of 
the computational basis elements, suffices for the implementation of an amplitude amplification [26] algorithm. 
This, in turn, allows us to find the chosen elements with a quantum speed-up. To illustrate this, assume we 
are given the state |7r), the (ideal) reflector R(P), and assume we are interested in finding some set of elements 
M C {1,..., iV}. The subset M is typically specified by an oracular access to a phase flip operator defined with 
Zm = 1 — 2 J2ies I*) <*l- The e l emen t searching then reduces to iterated applications of ZmR(P) (which can be 


6 The original algorithm by Kitaev allowed the estimation of the eigenphases of a given operator, where the final step is an inverse 
quantum Fourier transform (QFT T) on the phase containing register. This algorithm can be, for our purposes, further simplified 
by substituting the QFT' with the suitable number of Hadamard gates, as suggested in [12]. This substitution maintains the 
probability of observing a ’zero’ phase, and the corresponding post-selected state, thus can be used to detect a non-zero phase. 
For this reason, this slightly tweaked algorithm is called the phase detection algorithm. 
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understood as a generalized Grover iteration, more precisely amplitude amplification) onto the initial state 17r) . 
Let 7 i denote the conditional probability distribution obtained by post-selecting on elements in M from 7r, so 


—, if i e M 
e 

0, otherwise, 


( 6 ) 


with e = Tr Let \i t) = Up I*) |0) denote the coherent encoding of n. Note that the measurement 

of the first register of |if) outputs an element in M with probability 1. Thus the capacity for preparing this 
state implies that the desired element from M can be found, directly by measurement. 

As it was shown in [13], applications of Zm and R(P) leave the register state in the two-dimensional subspace 
span({|7r), |tt)}) and moreover, using 0(l/y/e) applications of the two reflections will suffice to produce a state 
\ip) € span{|7r) , |7r)} such that | (-0| ff) | 2 is a large constant. Measuring the first register of such a state will 
result in an element in M with a constant probability, which means that iterating this process k times ensures 
an element in M is found with an exponentially increasing probability in k. However, since the state \ip) is also 
in span{|7r), |7r)}, it is easy to see that the measured outcome, conditional on being in the set M, will indeed 
be distributed according to n. 

In our recent work [20], and also in [2], these results were used to produce a sample from the truncated 
stationary distribution tt. in time 0(1/y/e) x 0(l/yfS) where the 5 term stems from the cost of generating the 
approximate reflection operator ARO(P), and 0(l/yfS) corresponds to the number of iterations which have to 
be applied. This is a quadratic improvement relative to using classical mixing, and position checking processes 
which would result in the same distribution. 

However, the same process can be used in reverse to generate the state |-7r) starting from some fixed basis 
state | i') = Up |i) |0) with cost 0(1 /y/5) x 0(l/y/rTi). Note that 7ri = | <7r| i')\ 2 is the probability of sampling the 
element i from the distribution tt. To see that this works, let W tot correspond to the product of all R(P)Zr i y 
reflections (so 0(1/Urr/) of them) that need to be applied to find the element i. The correctness of the search 
algorithm then guarantees that the trace distance between the final state and the target state is a (small) 
constant c, so 1/2|| | i') (i'\ — W to t |7r) (7t| W/ ot || < c. But since the trace distance (and also fidelity) are preserved 
under unitary maps, and since W to t is unitary, we also have that 1/2|| W} ot \i') (i'\ W to t — |tt) (7r| || < c. Thus the 
resulting state obtained by reversing the search process is constantly close to the state |7r). But then, the |7r) 
projection measurement we described previously will recover (an arbitrary good approximation of) the |7r) state 
with a constant probability. By iterating this entire process, should it fail (the iteration is possible, since we 
can generate |?') cheaply on demand), we will get the desired state |7r) with exponentially increasing probability 
in the number of attempts. 

Such a process of recovering the state |7r) corresponds to a classical mixing process. Classical mixing (for 
time-reversible Markov chains) can be achieved in 0(1/5 x log(l/7r m j„)) (ignoring error terms), whereas the 
quantum process terminates in 0(1/y/5 x 1/y/ir min) 1 ■, where 7r m j„ denotes the smallest occurring probability in 
7 r, in the worst case. Hence we can see a quadratic improvement w.r.t the 5 term in the quantum case. However, 
the scaling relative to the probability term 7r min constitutes an exponential slowdown relative to the classical 
mixing bounds, and this trade-off is prohibitive. 

We highlight that the approach we have just described for attaining stationary distributions by running hitting 
algorithms in reverse was first proposed by Richter [7] 7 8 , extending on observations by made by Childs [7, 27]. 

The basic idea of this work will be to ensure that the choice of the initial seed state |«) is in fact the best 
possible. However, the best possible situation can still be to costly as the highest probability may still be as 
small as 1/-/V, as is the case for the uniform distribution. In these cases there is a more efficient way to prepare 
the initial state, which we clarify next. 


E. Preparation from the uniform distribution 

As we have described previously, the access to the W(P ) 9 operator allows us to perform a projective 
measurement to the state |7r). Thus, if we prepare the coherent encoding of the uniform distribution state 


7 We are ignoring the logarithmically contributing precision term log( 1 / error) in both cases. 

8 The approach to quantum mixing we outline here was developed before the authors were aware of the observation by Richter, 
and independently from the paper [7]. During a more extensive literature review, the cited paper by Richter was identified as 
the, to our knowledge, the first paper to outline the idea as a comment in the preliminaries section. 

9 More precisely, we require a controlled variant of the W(P) operator. 
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| u) = Up (1/vN I*) |0)J , simply by performing the 17r) projective measurement on it, we still have the prob¬ 

ability F(\u) , 17r)) = | (u| 7r)| 2 of collapsing to the correct state. By repeating this process until we succeed, we 
obtain a preparation algorithm with expected running time of 0(l/y/S) x 0(1/F(\u) , |7r))). However, we can 
improve on this by “Goverizing” this process, that is, by using amplitude amplification [26]. This amounts to 
reflecting over |it) and 17r) iteratively, starting from |it) , until we reach a state close to the target state |7r), with 
an overall cost 0(1/\/S) x 0(1/F(\u) , |7r))). 

We use the generalizations of this approach in [28] to generate coherent encodings of stationary distributions 
in the cases where the shape of the target distribution is to some extent known. For the purposes of this paper, 
however, we will only require unsearching from the uniform and from Kronecker-delta distributions. 

The preparation method starting from the uniform distribution, and also the unsearch approach from a fixed 
state, are a special cases of the more general amplitude amplification protocol we have just described. 

The two methods, unsearching and preparation from uniform, of preparing the state |7r) are complementary, 
in the sense that the latter method is more efficient when the stationary distribution is close to uniform, where 
the unsearching becomes efficient when an element has a high probability (roughly, when the distribution is 
far from uniform). Our overall approach we present next will use both methods for preparation, and provide 
a method for identifying the right candidates (elements with the highest probability in 7r) for the unsearching 
approach. In what follows, we will say that the (coherent encoding of) distribution n is far from uniform, if 
Fd^) , |u)) > 1 /V~N, and otherwise, we will say the distribution (equivalently, its coherent encoding) is close to 
uniform. 


IV. THE PROTOCOL 

We will first establish the notation for the remainder of the paper. A given element of a sequence we are 
referring to, will, in the remainder of the paper, be specified by a superscript in the cases of transition matrices 
and spectral gaps, e.g. P t , 5 t for the t th element. In the case of distributions, we will use parentheses (e.g. n(t)), 
since we have reserved the subscripts to denote a particular probability in a given distribution. 

We proceed by formally specifying the setting we consider. We assume we are, at each time-step t, given the 
Szegedy walk operators W(Pt ), associated with a sequence of time-reversible Markov chains {Pt}//Lr over the 
same state space of N elements, along with each spectral gap <5fc 10 - 

The task is, at each time-step t, to generate the coherent encoding of the stationary distribution |7r(t)), with 
cost in 0(N 4 / 4 / 

To achieve this, we require further assumptions, namely that the Markov chains are slowly-evolving. More 
precisely, we require that the stationary distributions n(t),Tr(t + 1) of neighboring Markov chains Pt,Pt+ i, 
respectively, are sufficiently close in terms of the fidelity of their coherent encodings. That is, we require that 
F(\n(t)} , |7r(f + 1))) > r], where r] > 0 is a real constant independent from the spectral gaps, and the state 
space size. Moreover, we will require that the spectral gaps St, <5t+i of neighboring chains Pt,Pt+i are relatively 
close, in the sense which we will specify later. As we will explain, this last assumption is not vital, but allows 
for a more convenient statement of the main result. Finally, we will assume that the coherent encoding of the 
stationary distribution |7r(1)) of the first Markov chain is easy to generate. 

These assumptions are essentially equivalent to the assumptions in [11, 12]. However, as we have clarified, in 
contrast to those works, in our result, at each time-step t, the stationary distribution can be prepared de novo , 
that is without using any quantum memory from step t — 1, with cost 0(N 4 ^ 4 /'/St). This, for instance implies 
that multiple copies can be generated at each time-step as well, if desired, without having to re-run the entire 
sequence of Markov chains. Moreover, our approach does not depend on the length of the sequence, as each 
stationary distribution is prepared “on the fly”, independently from the quantum states utilized in previous 
steps. Both properties are vital in the context of active learning agents that we have mentioned previously. 

To explain how our protocol works, we will describe two particular settings where the cost of preparation 
of the encoding of the stationary distribution 17r) of an TV—state Markov chain P with spectral gap d is in 
d(N 1 / 4 /y/S). 

In the first setting the fidelity between the coherent encoding of the uniform distribution |u) and |7r) is above 
N~ 1 / 2 . In this case, as we have shown, the preparation starting from uniform has the desired overall cost 

d(F(\u),\Tr))- 1 / 2 5- 1 / 2 ) = d(N 1 / 4 V6). 
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Effectively, we only require a sensible lower bound on the spectral gap. 
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In the second setting the stationary distribution n of P has the probability of a mode n max (the largest 
occurring probability) larger than TV -1 / 2 , and the mode state itself i ma x is known. In this case, unsearching 
from the element i max will produce the target state with cost in 0(1/y/S x 1/ y/^ max ) = 0(N 1 I 4 \J1>). 

Our first technical result shows that any Markov chain P fits in one of the two settings above, which is 
captured by the following Lemma, proven in the Appendix. 

Lemma 1. Let n be a distribution over N states, such that F(\u ), |7r}) < 1 /y/N. Then max^ 7Ti > 1 /y/N. 
Moreover, i/maxj 7 Tj < 1 /y/N, then Adit), | 7 r)) > 1/y/N. 

The lemma above has a few immediate consequences. First of all, if we are given a Markov chain P , (over N 
states) with known < 5 , the mode i ma x of the corresponding stationary distribution 7 r, along with the probability 
of the mode 7 Ti mQx , then it is clear that we can prepare the stationary distribution within cost 0(N 1 / 4 /y/5): 
if 7q > 1/y/N , we use the “unsearch from |i)” approach. If it is not, then by the second claim of Lemma 1, 
we know that we can prepare the initial state by the preparation from the uniform distribution within cost 

0(N 1 / 4 /V6). 

It is also easy to see that the assumption of knowing the probability of the mode 7 q is actually not needed. 
One can first attempt the preparation from the uniform distribution a suitable number of times, where the 
number of reflections used is upper bounded by 0(N 1 / 4 ) 11 - if the target distribution closer than 1 /y/N to the 
uniform distribution, in terms of the fidelity, then this will succeed with exponentially high probability in the 
number of attempts. If all attempts fail, we are (except with exponentially small probability) then sure we are 
in the regime where the mode has a probability higher than 1/y/N, and this is all we need to know. Then, the 
unsearching approach, starting from the mode i ma x will (with high probability) produce the target state if we 
employ 0(N 1 / 4 ) iterations, so with overall cost (/(N 1 / 4 /yfS). We will take care of the failure probability of this 
approach later. 

However, even the assumption that the mode (but not the probability of the mode) is known is most often 
too strong to be justified. Nonetheless, if we are dealing with a scenario in which we have a sequence of Markov 
chains, such that a) the stationary distributions of consecutive Markov chains are sufficiently close, and b) the 
first Markov chain has a known, easy to prepare stationary distribution, then we can recover the same results 
without the need to explicitly find a mode. 

To illustrate how this is achieved, consider the setting of just two Markov chains, P\ and P 2 , (with corre¬ 
sponding stationary distributions 7 r(l), 7 t( 2 ), such that | 7 r(l)) is easy to prepare. By easy to prepare we mean 
within the cost 0(N 4 ^ 4 /y/Si), so it will, for instance, suffice that we know the mode of 7r(l) and it is above 
1/y/N, or that the fidelity (relative to the uniform distribution) is above 1/y/N. 

To prepare the (coherent encoding of the) stationary distribution of P 2 , we first proceed with the attempt to 
recover it by unsearching from the uniform distribution. If this succeeds, we are done. If this approach should 
fail, we proceed as follows: we first prepare d £ INI copies of the state | 7 r( 1 )), where d is a (small) confidence 
parameter. Recall, we have assumed the stationary distributions of Pi and P 2 are close, so we will have that 
P(| 7 r(l)), |tt(2))) > 77 , where 77 is some (large) constant. This implies that a projective measurement on the 
state | 7 t( 2 )) of the state | 7 r(l)) will succeed with average probability 77 . This measurement has cost (/(l/y/S/), 
so with overall cost 0(c'/y/S 2 ) we can prepare, on average, c = rjc' copies of the state | 7 r( 2 )) 12 . In the actual 
protocol, we will iterate the preparation until we do have c copies, and d above then establishes the expected 
number of iterations. 

Next, we simply measure (the first register of) all of the c copies of the state, obtaining c independent single 
element samples from the distribution 7 r( 2 ). As it turns out, this is sufficient for the task at hand. If the fidelity 
of | 7 t( 2 )) , relative to the uniform distribution state |u), is below 1/y/N 13 , then with probability 1 — 2 -c , at least 
one state i, out of the c independently sampled states, has the corresponding probability 7r(2)$ > 1/(4 y/N). 
This result is captured by the following Lemma, and proven in the Appendix: 


11 More precisely, we would use the use a randomized approach as presented in [29], which only requires a lower bound. We note 
that the approach of [29] can be applied if a lower bound is known, but also the upper bound should not surpass 1/4. This is 
achieved by directly performing the |tt) projective measurement on the uniform distribution state a couple of times. If it succeeds, 
we are done, should it fail, we can conclude that the overlap is below 1/4, as required, except with an exponentially decaying 
probability in the number of attempts. The same approach, albeit applied to the task of element finding, was first suggested in 
[13] . 

12 We note that if 77 is very small (but the assumption is it is independent from N and the spectral gaps) we can do better by 
utilizing quantum amplitude amplification [26] again - given the initial state | 7 r( 1 )), by using the reflection over it, and the 
reflection over 1 7 r(2)), we can obtain the target state 1 7 r( 2 )) with a quadratic smaller cost with respect to 77 . However, since for 
this work we assume 77 is constant this still yields the same overall scaling. 

13 This is the case, except with very small probability, since we assume that the approach of preparation from the uniform distribution 
had failed. 
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Lemma 2. Let n be a distribution over N states, and let F(\n) ,\u)) < 1/y/N. Then there exists a set of indices 
S C {1,..., N} such that the two following properties hold: 

• minig 5 7 Ti > — -z= and 

4 y/N 

• P(S) = E ies^i > \- 

As the next step, we simply sequentially attempt to prepare the target state through unsearching from the 
sampled states and employing 0(A' 1 / 4 ) iterations of the reflections. With probability at least 1 — 2 _c , one of 
the attempts will succeed. 

What we have shown is that having a collection of c independent single element samples from 7 t( 2) suffices to 
efficiently prepare 17 r( 2 )) , in the regime where the preparation from uniform distribution would not be efficient. 
From these observations, the presented approach for two Markov chains inductively extends to the setting with 
a sequence of Markov chains that we wish to consider. We now give the full protocol, along with a more rigorous 
analysis. 

In what follows, we will be assuming all the approximate reflection operators are in fact exact, and we will 
deal with the errors induced by approximations later. The protocol will use two subroutines. The subroutine 
PrepareFromUniform(c) attempts the preparation from the uniform distribution, using C^iV 1 / 4 ) reflections. If 
the target distribution state close to the uniform distribution state (in the sense we defined previously), then 
by utilizing the randomized approach [29] we will obtain the target state except with probability below 1/2 14 . 
We will, for this subroutine, allow for c attempts to prepare the target distribution. Then we will succeed, 
whenever the fidelity relative to the uniform distribution state is above IV -1 / 2 , except with probability 2 -c . 
The output of this subroutine is either the coherent encoding of the stationary distribution, or “unsuccessful” 
- a flag indicating that the preparation failed and that the target distribution is far from uniform, except with 
small probability. The cost of this procedure is 0(c V 1 // 4 / y/Sf), at time step t. 

The second is the Preparesamples(c) subroutine. In the context of the overall protocol, we will make sure 
that, at each step we generate in total c elements sampled from the target distribution. One of these is output, 
and all are saved, in the case we need them for the next step. The PrepareSamples(c ) subroutine, used at 
time-step t > 2 , back-tracks to the previous step, and first prepares the coherent encoding for the previous step 
| 7 r(t — 1)). Depending on whether the previous stationary distribution is close or far from the uniform (that 
is, closer or further than 1/y/N, in terms of the fidelity with the uniform distribution) for this we may require 
c samples from the previous distribution itself. As we have clarified, we will make sure we always have those 
in the overall protocol. Given the c samples for the previous step, the encoding \n(t — 1)) can be generated 
with cost 0(cN 1 / 4 /^/5t-i), except with probability 2 -c by Lemma 2 (in the case we accidentally have bad 
samples), either by using the preparation from the samples, or by preparing from the uniform. Following this, 
on the state \n(t — 1 )) we apply a \tt (£)) — projective measurement (with cost 0(l/y/St)) and with probability 
77 we succeed to project onto | 7 r(t)). This process is repeated until c copies of |7 t(£)) are generated, and they 
may be immediately measured. One of the sampled elements (measurement outcomes) is output, and the other 
c sampled elements are stored for future use by the Preparesamples subroutine. 

The situation is analogous in the case the previous distribution was prepared from the uniform. 

We highlight that, irrespective of the method we used at time step t — 1, Preparesamples{c) will attempt 
to regenerate the states \n(t — 1 )) by using the original approach first, but, if that should fail, it will switch to 
the alternative 15 . 

In the case Preparesamples(c) is run at time-step t = 2, the procedure is analogous as above, with the 
difference that, by assumption, we can cheaply generate the required encodings |7r(l)) of the previous step. 

This subroutine has expected running time 0(77 c V 1 / 4 / y/S t ~i), and a failure probability 2 -c . Since we do 
not consider the scaling in 77 , we obtain 0(cN 1 / 4 / y/5 t -i). 

Now we can give the protocol, where t denotes the time-steps: 


14 Here we again, as a technical point, assume that we have eliminated the possibility that the overlap between the uniform 
distribution and the target distribution is over (or equal to) 1/4, by attempting direct projective measurements first. For this, it 
will suffice to attempt the projective measurement 3c times - failing to generate the target state, if the fidelity is above or equal 
to 1/4, will then occur with probability below (3/4) 3c < 2~ c . Since the cost of the projective measurements does not depend on 
N, we may ignore this in the complexity analysis. 

15 Note note that switch from the samples approach to the preparation from uniform approach is always possible, and the switch 
from the uniform to the samples approach is possible because we will always, regardless of the regime, prepare and store c 
independently sampled elements. 
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The protocol 

1. If t = 1, prepare the corresponding coherent encoding of the stationary distribution, measure, and output 
the outcome. Keep the operator W(P\) (and <5i) in memory for one additional time-step. 

2. If t > 1, execute PrepareFromUniform(2c), c times. If each run generated the target distribution, save c 
sampled elements for future use, and output one as the current output. If any run returns “unsuccessful”, 
abort, and run Preparesamples(c). In both cases replace the stored operator W(Pt-i) with the current 
W(P t ) (also with S t ,) and proceed to the next time-step. 


A. Protocol analysis 

First, we analyze the protocol under the assumption that the realized approximate reflection operators are 
perfect. In this case, the protocol above has, at each time-step t (for t > 1,) the expected running time in 
0(2c 2 7V 1 / 4 / 1k /min{(I t _i, S t }) = 0(c 2 AT 1 / 4 /min{<5 t _i, <j 4 }), where c is a confidence parameter, as this expres¬ 
sion is the maximum of the costs of both possible preparation subroutines. 

If we have the assumption that the neighboring spectral gaps ch_i and S t are multiplicatively close, meaning 
that there exists a constant k £ R + (independent from N) such that for all t > 1 we have that 

5 t -i/n <S t < n6 t _ i, (7) 

then the cost of preparation is in 0(c 2 N 4 / 4 /\/5t) for each f, which was the desired cost. The protocol can, 
however, fail with probability 0{ 2 _c ), which we clarify next. First, note that the PrepareFromUniform(c) 
subroutine may fail - that is, report ’’unsuccessful”, although the distribution is in the right regime (close to 
uniform). In our protocol, we call this subroutine c times, with parameter 2c. This entire iteration fails if at 
least one of the runs reported “unsuccessful”, although the target distribution was close enough to the uniform 
distribution. If the target distribution is in the required regime, PrepareFromUnif orm{2c ) run once reports 
“unsuccessful” with probability 2 -2c . The probability at least one “unsuccessful” report in a sequence of c runs 
is then given with 1 — (1 — 2 _2c )° = 1 — (1 — 4 -c ) c . However, we have that 1 — (1 — 4 -c ) c < 2~ c , which we here 
prove for completeness. We have that 

1 - (1 - 4~ c ) c < 2~ c <S> (1 - 4T C ) C > 1 - 2~ c . (8) 

For the expression (1 — 4 -c ) c we have, by the Bernoulli’s inequality, that (1 — 4 -c ) c > 1 — c4 _c , so it will suffice 
to show that 1 — c4 -c > 1 — 2 -c , which is equivalent to c < 2 C , which is true. 

Thus in our protocol, failure to prepare the required c independently sampled elements, in the case the 
distribution is sufficiently close to the uniform distribution, occurs at most with probability 2 -c . 

If the distribution is not close to uniform, we may end up running the PrepareSainples{c) subroutine, which 
will attempt the preparation of the c samples, by regenerating the encodings of the stationary distributions of 
the previous step. For this, it may utilize either the c samples from that distribution or attempt preparation 
from the uniform distribution state, and in the worst case, it will attempt both. Since the target distribution 
must be in one of the two regimes, and since both cases have a failure probability of 2 -c , this also gives the 
overall failure probability. 

Hence, we have shown that our protocol, under the assumption that all the reflection operators (and mea¬ 
surements) are perfect, generates a sample from (or a coherent encoding of) the target stationary distribution, 
with cost in O(o 2 N 1 / 4 5 t ), with a failure probability in 0(2~ c ). 

In the real protocol, the reflection over the target state \i r) is not ideal (as we only achieve an approximation 
of the reflection) and neither is the 17r) projective measurement. Taking into account the effects of these 
imperfections, we obtain the expected run-time of 0(c 3 N 1 / 4 /y/St), with the same failure probability in 0(2~ c ). 
Analysis of this is provided in the Appendix. 

We finish of this section with a comment on how total failure can be dealt with, when failure is not an option. 
In the context of (effectively) infinite sequences of Markov chains, the exponentially unlikely failure will still 
occur. In this case, if we are required to proceed although the protocol failed at time-step t, one can always 
prepare a sufficient number of samples from |7r t ) in time 0(N 4 ^ 2 / v^t), by forcing the preparation from the 
uniform distribution. Although this constitutes a quadratic slowdown (w.r.t. the state space size), it will only 
occur exponentially rarely, which means that, at least the average preparation cost for each time step can be 
kept arbitrarily close to O^N 1 / 4 /^^). 
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V. DISCUSSION 

We have presented a quantum algorithm for sequentially generating stationary distributions of an arbitrar¬ 
ily large sequence of Markov chains. The quantum algorithm outperforms classical approaches whenever the 
spectral gaps <5 of the Markov chains are below 1/y/N, where N is the size of the state space. In contrast, 
straightforward application of the “mixing by reverse hitting” approach would yield improvements only in a 
quadratically more stringent regime where S < 1/N. The basic observation we have used is that the bottle-neck 
of direct mixing by running hitting algorithms in reverse, can be ameliorated when only a small number of 
elements sampled from the target distribution are available beforehand. We have shown that this can guarantee 
that the initial state of the unsearch approach is far from the worse case setting. Following this, we have shown 
how these samples can be made available in the context of slowly evolving Markov chains. As we have clarified, 
the presented algorithm has an immediate application in a recent approach to (quantum) artificial intelligence 
[20], but it may be useful in other context as well. For instance, it may offer improvements for problems 
stemming from statistical physics. One application could be in the case when strictly independent samples 
from Gibbs distributions of physical systems are required in a large range of temperatures, which include the 
computationally difficult low-temperature regimes. Other applications may be possible as well, for instance in 
applications where subsequent Markov chains may depend on the actual outputs of previous mixing steps. In 
this case, quantum-enhanced classical annealing methods become unsuitable, as they need to keep coherence 
through the protocol steps [19]. 

As a feature of our protocol, we point out that at each time step can output not just a classical sample 
from the target stationary distribution, but a coherent encoding of this distribution. This is not a guaranteed 
characteristic of quantum mixing protocols [7], and makes our approach suitable for combining with other 
quantum protocols which start from such a coherent encoding [2, 13, 20]. 

In the protocol we have presented, as in other related works, it is always assumed that aside from the Markov 
chains themselves, one also has access to the values of the spectral gaps. This is potentially a problematic 
assumption, since, at least in the general cases, spectral gaps are often difficult to determine. Consequently, 
methods which do not rely on good lower bounds of the spectral gaps, or, more precisely, which can adaptively 
estimate the changes in spectral gaps in the context of slowly evolving sequences, are part of ongoing work. 
Acknowledgments: 
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VI. APPENDIX 


In this section we prove the technical lemmas from the main body of the paper, which we repeat for the 
benefit of the reader. Following this, we provide an analysis of our protocol covering the imperfections in the 

reflection operators. _ _ 

Lemma 1. Let ir be a distribution over N states, such that F'(ltt), |7r}) < 1 /VN. Then maxi7r.; > 1/y/N. 
Moreover, «/maxj7Ti < 1 /y/N, then F(\u) , |7r)) > 1 /VN. 


Proof. Assume first that max^ ni < 1/y/N. Then, we ask what distribution 7 r minimizes the fidelity, relative to 
the uniform distribution, satisfying the given constraint on the mode(s). We claim that the distribution which 
minimizes the fidelity is the distribution 7 r (up to permutation of the probabilities, which does not change the 
overlap with the uniform distribution) defined as follows. 


Let pmax = 1 /y/N and k 


1 

Pmax 


For all i such that 1 < i < k we set 7 r, = 7 r max . Furthermore, we set 


TTk+i = (1 — k Pmax)- Finally, for all remaining states we set Tti>k+i = 0. 

To see this is the case, first note that the permutation of the probabilities does not change the overlap with 
the uniform distribution. Thus it will suffice to consider distributions whose probabilities are ordered in a 
decreasing order according to the indices. We will call such distributions decaying distributions [28]. Next, 
we will say that the decaying distribution p is obtained from the decaying distribution 7 by separating the 
probabilities of elements i and j in 7 , (for i < j) if the following holds: 7 k = Pk for all k / i and k / j, and 
7i < pi and 7 j > pj. Intuitively, to obtain p from 7 we simply shift a part of the mass of the probability at 
state j to the state at i while maintaining the order. Next, note that the distribution 7 r is the extreme point of 
such a probability separation process, for all decaying distributions satisfying the constraint on the probability 
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of the mode: 7 r can be obtained by iterating this process from any decaying distribution <r, which satisfies the 
constraint max^ cr, < 1 /y/~N. 

For completeness, we illustrate why this works. For instance, by starting from the smallest non-zero proba¬ 
bility element i in a, decreasing it, while increasing the largest probability element in j in the distribution a 
which is smaller than p ma x until the modified value of 07 equals p m ax, or until we deplete cr,;. By iterating this 
procedure, in a finite number of steps we will have reached 7 r. 

Next, we claim that if the decaying distribution p is obtained from the decaying distribution 7 by separating 
the probabilities of elements i and j, then F(| 7 ), |n)) < F(\p) , \u)). This follows from the convexity of the 
fidelity relative to the uniform distribution: since we are only changing the probabilities of the elements i and 
j, the distance from the uniform distribution (fixing all other parameters) is up to squaring proportional to 
/ {jpi , Pj ) = yfpi + y/p], where Pi + pj is constant. This function clearly decreases as pi grows at the expense of 
P.v 

But since 7 r is the extremal point of the process of separating the probabilities (under the constraint that 
Ttmax < Pmax), the distribution 7 r as defined minimizes the fidelity under the given constraint on the mode of the 

distribution. The fidelity between | 7 r) and |u) is now easy to compute: We have that F(| 7 t) , | u)) = — | 

and we will evaluate f(n ): = ]T) ■ 

We have that 


/O) 


1 

Pmax 


Vp 


max 



1 

Pmax 


Pmax • 


(9) 


This expression can be further simplified. In the following, let for x £ R + , { 2 ;} = x— [ ;r J denote the fractional 
part of x. Then we have: 


.Pma 


y/Pri 


+ \ 1 — 


_Pma 


Pma 


= (; 


Pma 


{ })\JPmax H“ \/l ( 


Pm> 




Pmax Pma 


~})Pn 




Vp™ 


- {^—}Vp max max 

Pmax V Pmax 

+ y/Pmax ( { } ~ {-})• 

V Pmax Pmax 


( 10 ) 

(11) 

( 12 ) 

(13) 


Since the fractional part is always between 0 and 1, and since on that interval it holds that y/x > x, we have that 
the expression (»/{——} — {——}) is always non-negative, the minimum is zero, and the maximum reached 

K \ X Pmax X Pmax 

at x = 1/4 where it reaches the value 1/4. Thus we have /( n) > 1 /y / p m ax so 




(14) 


This proves the second direction of the lemma. 

By taking the contrapositive of the second direction we immediately obtain 

F(\u),\tt))<1/VN^ max 77 > _ (15) 

* vN 

For the case that F(|u) , | 7 r)) = 1 /V~N, by similar arguments as before, we get 7 r max > 1 /y/N, so the Lemma 
holds. □ 

Next, we prove Lemma 2. For convenience we will rephrase it in terms of the function / defined as /( 7 r): = 
\[Fi, which is up to a square proportional to the fidelity w.r.t. the uniform distribution: 


^(K)») = 


(16) 


Lemma 2 (rephrased). Let it be a distribution and let /( 7 r) < N 1 ^. Then there exists a set of indices S C 
{1,..., N} such that the two following properties hold: 
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min ie s 7 Ti > 


1 


4 VN 


and 


1 


«es ni - 2 • 


4\/iV 


and/or 


• W = E ie 

Proof. Assume the lemma does not hold, that is, for every S C {1,...,JV} either mini 6 s7Ti < 

Eies 71 ’* < l/^- 

Let S be the set of all the indices of all probabilities occurring in 7r, which are larger or equal to Note 

that by Lemma 1, since f{i r) < N 1//4 ■£=> F(|u) , |7r)) < l/v//V, there exists at least one probability larger or 
equal to -^=, thus the set & is non-empty and P(S) > 0. 

For this lemma to be false, it then must hold that EieS 7r * < ^ • 


But then, for the complement set of indices S c = {1,..., N} \ S the following holds: 

p (s c ) = E ^ v 2 ’ 

ies c 


and 


(17) 


max7r, < —=. 
*eS c 4V-/V 


Note that, by the assumptions of the Lemma it holds that 

Ev^+ E v ^< n i/4 , 

ies ies c 


(18) 


(19) 


and, as we have seen, Eies > 0- 

Now, consider the renormalized distribution i r, where all probabilities corresponding to elements in S are 
set to zero. By Eq. (17), the renormalization factor is below 2. Then, since max ieS c 7Tj < it holds 

that max,; iq < -——. Finally, we proceed analogously to the proof of the second direction of the first Lemma 


2y/N' 


1 


(Eq. (9) to Eq. (14)) to find a bound on the r) function under the constraint that max; j f* < —-=. We obtain 


2 VN' 


/W > 


^1/(2 VN) 


= yfiVN, 


( 20 ) 


which implies that Eie5 c — -=f(n) > N 

v 2 

Since f(ir s ) > 0 (strict inequality) we have the desired contradiction with Eq. (19) since N 1 / 4 + /(7r s ) is 
strictly larger than N 1 / 4 . □ 

a. Analysis for imperfect reflection operators Here we consider the propagation of errors when the reflec¬ 
tion operator over the stationary distribution, and the 17r) projective measurement, are approximate. Recall 
that both in the cases of the preparation from the uniform distribution, and in the cases of preparation from 
a given sampled element i, the precision of the approximation of the target state comes into play only loga¬ 
rithmically. More precisely, if e is the desired bound on the trace distance between the realized distribution 
and the targeted distribution, and if f is the fidelity between the initial state (uniform distribution or the 
given sample state |i)), and the target state, then the total cost of the preparation procedure is given with 

O ^\/ S~ 1 y/p~ 1 ^log (e -1 ) + log (\/5 iri "))) • the last expression, the second log term compensates for the 

fact that an imperfect reflector will be applied \Zf,~ 4 times, accumulating errors 16 . 


16 We note that the errors stemming from the iterations of the approximate reflection operator can be further suppressed using 
more elaborate techniques, see [13] for further details. 
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Thus, the precision of the approximation contributes only logarithmically in the overall complexity, even 
in the iterated setting. However, we must make sure that the inductive steps of our protocol, going from 
one time step to another, are not overly sensitive to small imperfections. There are two moments where the 
imperfections can cause problems. First, except for the first time-step, the c samples we have stem not from 
the exact distribution, but rather the approximation. Second, in the generation of the c samples at step t we 
used an approximate projective measurement to go from an approximation of \n(t — 1 )) to an approximation 
of | 7 r(f)), which succeeds with probability 77 (the fidelity between the two states), only in the exact case. For 
the second problem, a simple way to bound the deviation on the success probability is by considering the ideal 
| 7 t) projective measurement as a completely positive trace-preserving (CPTP) map £\^ which outputs just the 
success or failure status (since we care only about the perturbations of the success probabilities). So 

£ \n (t)>(k(* - 1 )) (tt (t - 1 )|) = 77 1ok) (ok| + (1 - 77 ) |fail) (fail). ( 21 ) 

The approximate projective measurement (precise within e) can be represented in the same way by the map 
£^, and we have that 

l/2\\£fo(p)-£ M (p)\\<e ( 22 ) 

for any state p, where || • || represents the standard trace norm on quantum states. The above holds for any 
pure state p, so we get the above by triangle inequalities for arbitrary states. We point out that the claim 
holds when complete maps (which also output the heralded quantum state, not just the success/failure bit) 
are considered, but as tracing out only reduces trace distances this claim also holds. Note that we do not 
need to consider purified systems (nor completely bounded norms on the maps), for our problem. Then if 
£\n( t )) (I 71 "(t — 1 )) ( 7 r (t — 1 )|) = cr |ok) (ok| + (1 — a) |fail) (fail|, we have that 

V 2 II £\n(t))(Wt - !)) <tt(* - 1)1) - £\ n(t)) {\Tr{t - 1)) (tt (t - 1)1)11 = |?7 - a\ (23) 

but then also \rj — a\ < e. In the following, let p K (t-i) denote the e—close approximation of \n(t — 1)) (in 
the trace distance), and let rj be the success probability of the approximate projection measurement on the 
approximation p w u-i)i so 

1 )) = n' I ok) (ok| + (1 - 77 ') | fail) (fail | (24) 

Then we have that 

\V~V'\ = 1 / 2 l|£| e 7 r(t)>(P^(t-l)) ~ ~ !)) <tt(* — 1)|)|| (25) 

and then by adding and subtracting (t — 1 )) (n(t — 1 )|), and by the triangle inequality we obtain 

|t7 -?/| < e + l/2||£T| e jr(t)> (p 7r(t _ 1) ) - £^ {t)) {\i^{t - 1)) < 7 r(i — 1)|)||, (26) 

which by the contractivity of CPTP maps yields 77 — rf < 2e. Then by setting e to 77/4 we get that if rj < 77 
( which is the problematic case) then v{ > 77 / 2 . In other words, as long as we make sure the error is below 
77/4 (which is still a constant), we are sure that the success probability of the approximate measurement on the 
approximate state is in the worse case halved. This constitutes only a a constant multiplicative increase in the 
run-time of our protocol, so the overall complexity expression is unchanged. 

The other problem we face in the light of the approximate nature of the operators we use, is that the c 
sampled elements we obtain do not stem from the distribution 7 r, but an e—close approximation (in terms of 
the trace distance). To analyze the worst case scenario how this influences our protocol, we shall employ similar 
arguments as above. Note that the “preparation from c samples” subroutine can be viewed as a CPTP map 
applied on c mixed states, all encoding the underlying probability distribution, which outputs success (heralds 
that the preparation succeeded) , except with probability 2 -c , if the target distribution is in the right regime, 
i.e. far from uniform. The c mixed states are obtained by computational basis measurements of the ideal 
coherent encoding of the target probability distribution |-7r (t)) . In the non-ideal case, we have as input c mixed 
states obtained by a computational-basis measurement of c approximations, which are within e distance from 
the ideal states. Since the trace distance can only decrease by measurements, and by its subaditivity w.r.t. 
tensor products, the total inputs, in the ideal and non-ideal case, differ by at most ce (in the trace distance). 
But then the output of the procedures (hence, also the success probability) cannot differ by more than ce. Thus 
we obtain that the failure probability for the non-ideal case is no greater than 2 -c + ce. If we set e = 2 -2c , the 
failure probability is lower bounded by 2 -c+1 , which is obeys the same scaling. Since the error term e appears 
logarithmically in the overall complexity, we get an additional multiplicative pre-factor of log(2 2c ) which is in 
0(c). 
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Then, the worst case complexity of our approach is given with 0(c 3 VS 1 N 1 / 4 ), with failure probability 
2~ c+1 . By adding one to all confidence parameters of the protocol, since (c-t- l) 3 £ 0(c 3 ), we obtain the cost in 
0(c 3 y/ 5 _1 7V 1 / 4 ) and the same failure probability, as for the ideal reflectors case, of 2~ c . 
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